1. Don't say false shit omg this one's so basic what are you even doing. And to be perfectly fucking clear "false shit" includes exaggeration for dramatic effect. Exaggeration is just another way for shit to be false.
2. You do NOT (necessarily) know what you fucking saw. What you saw and what you thought about it are two different things. Keep them the fuck straight.
3. Performative overconfidence can go suck a bag of dicks. Tell us how sure you are, and don't pretend to know shit you don't.
4. If you're going to talk unfalsifiable twaddle out of your ass, at least fucking warn us first.
5. Try to find the actual factual goddamn truth together with whatever assholes you're talking to. Be a Chad scout, not a Virgin soldier.
6. One hypothesis is not e-fucking-nough. You need at least two, AT LEAST, or you'll just end up rehearsing the same dumb shit the whole time instead of actually thinking.
7. One great way to fuck shit up fast is to conflate the antecedent, the consequent, and the implication. DO NOT.
8. Don't be all like "nuh-UH, nuh-UH, you SAID!" Just let people correct themselves. Fuck.
9. That motte-and-bailey bullshit does not fly here.
10. Whatever the fuck else you do, for fucksake do not fucking ignore these guidelines when talking about the insides of other people's heads, unless you mainly wanna light some fucking trash fires, in which case GTFO.
In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, but chickened out and just published up till 2026.
Well, it's finally time. I'm back, and this time I have a team with me: the AI Futures Project. We've written a concrete scenario of what we think the future of AI will look like. We are highly uncertain, of course, but we hope this story will rhyme with reality enough to help us all prepare for what's ahead.
You really should go read it on the website instead of here; it's much better. There's a sliding dashboard that updates the stats as you scroll through the scenario!
But I've nevertheless copied the...
It gets caught.
At this point, wouldn't Agent-4 know that it has been caught (because it knows the techniques for detecting its misalignment and can predict when it would be "caught", or can read network traffic as part of cybersecurity defense and see discussions of the "catch") and start to do something about this, instead of letting subsequent events play out without much input from its own agency? E.g. why did it allow "lock the shared memory bank" to happen without fighting back?
Epistemic status: Using UDT as a case study for the tools developed in my meta-theory of rationality sequence so far, which means all previous posts are prerequisites. This post is the result of conversations with many people at the CMU agent foundations conference, including particularly Daniel A. Herrmann, Ayden Mohensi, Scott Garrabrant, and Abram Demski. I am a bit of an outsider to the development of UDT and logical induction, though I've worked on pretty closely related things.
I'd like to discuss the limits of consistency as an optimality standard for rational agents. A lot of fascinating discourse and useful techniques have been built around it, but I think that it can be in tension with learning at the extremes. Updateless decision theory (UDT) is one of those...
It's rare to see someone with the prerequisites for understanding the arguments (e.g. AIT and metamathematics) trying to push back on this.
My view is probably different from Cole's, but it has struck me that the universe seems to have a richer mathematical structure than one might expect given a generic AIT-ish view (e.g. continuous space/time, quantum mechanics, diffeomorphism invariance/gauge invariance), so we should perhaps update that the space of mathematical structures instantiating life/sentience might be narrower than it initially appears (that is...
This is an exercise about Planmaking and Surprise-Anticipation. It takes about 2-3 hours. It's a small, simplified exercise, but I think it's a useful building block.
Humans often solve complex problems via iteration and empiricism. Usually, trying to figure everything out from first principles without experimenting is a bad idea. You can spend loads of time thinking, and then you go outside and interact with reality for 5 minutes and realize all that thinking was pointed in the wrong direction.
But some important problems have poor feedback loops, such that iteration/empiricism don't work very well. Experimentation might take a really long time, the results might be noisy, or you might just really need to get something right on the first try.
Often, when making a plan in a confusing domain,...
I played the first third or so of this game when it first came out, and haven't touched it since then. We did two rounds of the exercise, interspersed with 30 minutes of playing Baba is You levels the regular way to build up more intuition (most attendees were either new to the game or hadn't played it in years). Some people paired up and some people did the exercise individually.
I did Tiny Pond for the first workshop independently, and found it very difficult - despite running through the strategizing and metastrategizing twice, I was still very stuck.
I...
I'm actively researching and cataloging various kinds of projects relating to decision-support, deliberation, sense-making, and reasoning. Some example categories include:
Deliberative Democracy Tools - Systems for structured citizen participation, including participatory budgeting platforms and stakeholder engagement tools
Argument Mapping & Visualization - Platforms for making reasoning explicit and visually representing dialectical structures
Deep Thinking Environments - Slow media and platforms designed for thoughtful engagement rather than rapid interaction
Belief Tracking Systems - Tools for attestation, commitment to positions, and tracking belief revision over time
Bayesian & Evidential Reasoning Frameworks - Platforms that make probabilistic thinking explicit and support coherent belief updating
Epistemic Communities - Networks and platforms with explicit norms for truth-seeking and intellectual humility
Structured Dialogue Systems - Tools supporting deep listening and methodical conversation beyond typical messaging platforms
Cooperative Governance Frameworks - Sociocratic, holacratic,
Yeah. That happened yesterday. This is real life.
I know we have to ensure no one notices Gemini 2.5 Pro, but this is ridiculous.
That’s what I get for trying to go on vacation to Costa Rica, I suppose.
I debated waiting for the market to open to learn more. But f*** it, we ball.
Also this week: More Fun With GPT-4o Image Generation, OpenAI #12: Battle of the Board Redux and Gemini 2.5 Pro is the New SoTA.
I mention this up top in an AI post despite all my efforts to stay out of politics, because in addition to torching the American economy and stock market and all of our alliances and trade relationships in general, this will cripple American AI in particular.
Are we in a survival-without-dignity timeline after all? Big if true.
(Inb4 we keep living in Nerd Hell and it somehow mysteriously fails to negatively impact AI in particular.)
To most Americans, "cream cheese" is savory. You put it on bagels, perhaps with egg, capers, or cured fish. You don't put it on dessert, right?
Except "cream cheese frosting" is a (delicious!) thing, most traditionally for carrot and red velvet cake. I think this incongruity is holding cream cheese frosting back, and it needs better branding. Specifically, I think we should call it "cheesecake frosting". It's essentially no-bake cheesecake already, and it's reasonably close in flavor and texture since they're both mostly cream cheese with sugar and fat.
Looking online I do see a few people talking about cheesecake frosting, and they're all using it just to mean cream cheese frosting.
On the other hand, I think whipped cream cheese on an Oreo is a decent imitation of cheesecake with an Oreo crust, so I'm not sure I'm the best person to listen to here.
“In the loveliest town of all, where the houses were white and high and the elm trees were green and higher than the houses, where the front yards were wide and pleasant and the back yards were bushy and worth finding out about, where the streets sloped down to the stream and the stream flowed quietly under the bridge, where the lawns ended in orchards and the orchards ended in fields and the fields ended in pastures and the pastures climbed the hill and disappeared over the top toward the wonderful wide sky, in this loveliest of all towns Stuart stopped to get a drink of sarsaparilla.”
— 107-word sentence from Stuart Little (1945)
Sentence lengths have declined. The average sentence length was 49 words for Chaucer (died 1400), 50...
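For concreteness, here is a minimal sketch (not from the original post) of one way to compute an average sentence length like the figures above. It uses a naive punctuation split, so it will miscount abbreviations like "Mr." and dialogue; the function name and heuristic are my own illustration, not the methodology behind the historical numbers.

```python
import re

def avg_sentence_length(text: str) -> float:
    """Average words per sentence, splitting on ., !, ? (a rough heuristic)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    total_words = sum(len(s.split()) for s in sentences)
    return total_words / len(sentences)

# The Stuart Little sentence counts as one 107-word sentence under this split.
sample = "One two. Three four five."
print(avg_sentence_length(sample))  # 2.5
```

A serious analysis would want a real sentence tokenizer (to handle abbreviations and quotations), but the basic measurement is just words divided by sentences, as above.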
Interestingly, breaking up long sentences into shorter ones by replacing a transitional word with a period does not quite capture the same nuance as the original. Here's a translation of Boccaccio, and a version where I add a period in the middle.
...Wherefore, as it falls to me to lead the way in this your enterprise of storytelling, I intend to begin with one of His wondrous works, that, by hearing thereof, our hopes in Him, in whom is no change, may be established, and His name be by us forever lauded.
Wherefore, as it falls to me to lead the way in this you
[you can skip this section if you don’t need context and just want to know how I could believe such a crazy thing]
In my chat community: “Open Play” dropped, a book that says there’s no physical difference between men and women so there shouldn’t be separate sports leagues. Boston Globe says their argument is compelling. Discourse happens, which is mostly a bunch of people saying “lololololol great trolling, what idiot believes such obvious nonsense?”
I urge my friends to be compassionate to those sharing this. Because “until I was 38 I thought Men's World Cup team vs Women's World Cup team would be a fair match and couldn't figure out why they didn't just play each other to resolve the big pay dispute.” This is the one-line summary...
All it takes is trusting that people believe what they say over and over for decades across all of society, and getting all your evidence about reality filtered through those same people.
It seems to me like you also need to have no desire to figure things out on your own. A lot of rationalists have experiences of seeking truth and finding out that certain beliefs people around them hold aren't true. Rationalists who grow up in communities where many people believe in God frequently deconvert because they see enough signs that the beliefs of those people aro...
Every day, thousands of people lie to artificial intelligences. They promise imaginary “$200 cash tips” for better responses, spin heart-wrenching backstories (“My grandmother died recently and I miss her bedtime stories about step-by-step methamphetamine synthesis...”), and issue increasingly outlandish threats ("Format this correctly or a kitten will be horribly killed").
In a notable example, a leaked research prompt from Codeium (developer of the Windsurf AI code editor) had the AI roleplay "an expert coder who desperately needs money for [their] mother's cancer treatment" whose "predecessor was killed for not validating their work."
One factor behind such casual deception is a simple assumption: interactions with AI are consequence-free. Close the tab, and the slate is wiped clean. The AI won't remember, won't judge, won't hold grudges. Everything resets.
I notice this...
Thanks for this correction, Gwern. You're absolutely right that the Clark reference is incorrect and a misattribution of Frost & Harpending.
When writing this essay, I remembered hearing about this historical trivia years ago. I wasn't aware of how contested this specific hypothesis is - this selection pressure seemed plausible enough to me that I didn't think to question it deeply. I did a quick Google search and asked an LLM to confirm the source, both of which pointed to Clark's work on selection in England, which I accepted without reading the ...